Evaluating Word Representation Features in Biomedical Named Entity Recognition Tasks

نویسندگان

  • Buzhou Tang
  • Hongxin Cao
  • Xiaolong Wang
  • Qingcai Chen
  • Hua Xu
چکیده

Biomedical Named Entity Recognition (BNER), which extracts important entities such as genes and proteins, is a crucial step of natural language processing in the biomedical domain. Various machine learning-based approaches have been applied to BNER tasks and showed good performance. In this paper, we systematically investigated three different types of word representation (WR) features for BNER, including clustering-based representation, distributional representation, and word embeddings. We selected one algorithm from each of the three types of WR features and applied them to the JNLPBA and BioCreAtIvE II BNER tasks. Our results showed that all the three WR algorithms were beneficial to machine learning-based BNER systems. Moreover, combining these different types of WR features further improved BNER performance, indicating that they are complementary to each other. By combining all the three types of WR features, the improvements in F-measure on the BioCreAtIvE II GM and JNLPBA corpora were 3.75% and 1.39%, respectively, when compared with the systems using baseline features. To the best of our knowledge, this is the first study to systematically evaluate the effect of three different types of WR features for BNER tasks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Named Entity Recognition in Persian Text using Deep Learning

Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

متن کامل

سیستم شناسایی و طبقه‌بندی موجودیت‌های اسمی در متون زبان فارسی بر پایه شبکه عصبی

Named Entity Recognition (NER) is a fundamental task in natural language processing and also known as a subset of information extraction. We seek to locate and classify named entities in text into predefined categories such as the names of persons, organizations, locations, expressions of times, etc. Named Entity Recognition for English texts has been researched widely for the past years, howev...

متن کامل

BANNER-CHEMDNER: Incorporating Domain Knowledge in Chemical and Drug Named Entity Recognition

Exploiting unlabeled text data to leverage the system performance has been an active and challenging research topic in text mining, due to the recent growth of the amount of biomedical literature. Named entity recognition is an essential prerequisite task before effective text mining of biomedical literature can begin. The participants of the CHEMDNER task of the BioCreative IV challenge are as...

متن کامل

بهبود شناسایی موجودیت‌های نامدار فارسی با استفاده از کسره اضافه

Named entity recognition is a process in which the people’s names, name of places (cities, countries, seas, etc.) and organizations (public and private companies, international institutions, etc.), date, currency and percentages in a text are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...

متن کامل

Biomedical Named Entity Recognition Based on Feature Selection and Word Representations

As an important task in biomedical text mining, biomedical named entity recognition (Bio-NER) has increasingly attracted researchers attention. Various methods have been employed to solve this problem and achieved desirable results on the annotated datasets. In this work, we focus on the feature set to reduce the training cost by feature selection and template optimization. Also, we integrate t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 2014  شماره 

صفحات  -

تاریخ انتشار 2014